Regulating Greed Over Time

Authors

  • Stefano Traca
  • Cynthia Rudin
Abstract

In retail, there are predictable yet dramatic time-dependent patterns in customer behavior, such as periodic changes in the number of visitors, or increases in visitors just before major holidays (e.g., Christmas). The current paradigm of multi-armed bandit analysis does not take these known patterns into account, which means that despite the firm theoretical foundation of these methods, they are fundamentally flawed when it comes to real applications. This work provides a remedy that takes the time-dependent patterns into account, and we show how this remedy is implemented in the UCB and ε-greedy methods. In the corrected methods, exploitation (greed) is regulated over time, so that more exploitation occurs during higher reward periods, and more exploration occurs in periods of low reward. In order to understand why regret is reduced with the corrected methods, we present a set of bounds that provide insight into why we would want to exploit during periods of high reward, and discuss the impact on regret. Our proposed methods have excellent performance in experiments, and were inspired by a high-scoring entry in the Exploration and Exploitation 3 contest using data from Yahoo! Front Page. That entry heavily used time-series methods to regulate greed over time, which was substantially more effective than other contextual bandit methods.
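The idea of regulating greed over time can be illustrated with a minimal sketch of an ε-greedy bandit. This is not the authors' algorithm: the scaling rule in `exploration_prob`, the function and parameter names, and the Bernoulli reward model are all assumptions made for illustration. The sketch only assumes that a positive reward multiplier is known in advance for each time step (for example, expected visitor volume), and it explores less when that multiplier is high.

```python
import random


def exploration_prob(base_eps, multiplier, max_multiplier):
    """Hypothetical rule: explore with probability base_eps at the peak
    multiplier, and proportionally more often when the multiplier is lower.
    Assumes multiplier > 0."""
    return min(1.0, base_eps * max_multiplier / multiplier)


def run_bandit(arm_means, multipliers, base_eps=0.1, seed=0):
    """Epsilon-greedy with time-regulated exploration (illustrative only).

    arm_means   -- assumed Bernoulli success probability of each arm
    multipliers -- assumed known, positive reward multiplier per time step
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms          # number of pulls per arm
    estimates = [0.0] * n_arms     # running mean of unscaled rewards
    max_mult = max(multipliers)
    total_reward = 0.0

    for g in multipliers:
        eps = exploration_prob(base_eps, g, max_mult)
        if rng.random() < eps:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the best current estimate
            arm = max(range(n_arms), key=lambda j: estimates[j])

        # Unscaled Bernoulli reward; the realized reward is scaled by g.
        base = 1.0 if rng.random() < arm_means[arm] else 0.0
        total_reward += g * base

        # Incremental update of the unscaled mean estimate.
        counts[arm] += 1
        estimates[arm] += (base - estimates[arm]) / counts[arm]

    return total_reward, counts, estimates


if __name__ == "__main__":
    # Hypothetical pattern: low-traffic periods alternate with a peak.
    pattern = [1.0] * 50 + [4.0] * 50
    reward, pulls, est = run_bandit([0.2, 0.5, 0.7], pattern * 5)
    print(reward, pulls, est)
```

With this rule, a step whose multiplier is a quarter of the peak explores four times as often as a peak step, so most of the learning cost is paid when rewards are cheap.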

Similar resources

Greed and Fear in Network Reciprocity: Implications for Cooperation among Organizations

Extensive interdisciplinary literatures have built on the seminal spatial dilemmas model, which depicts the evolution of cooperation on regular lattices, with strategies propagating locally by relative fitness. In this model agents may cooperate with neighbors, paying an individual cost to enhance their collective welfare, or they may exploit cooperative neighbors and diminish collective welfar...

Exercise for slimming

The pejorative aphorism Obesity is the result of greed and sloth has come to be applied to the fat individual. The inferences implied are widely accepted not only by the public and journalists but by the health-care professional and the nutritional scientist too. Is this justified particularly with respect to sloth? Of course, fat deposition must be the result of excess intake of energy in rela...

Greed, fear and stock market dynamics

We present a behavioral stock market model in which traders are driven by greed and fear. In general, the agents optimistically believe in rising markets and thus buy stocks. But if stock prices change too abruptly, they panic and sell stocks. Our model mimics some stylized facts of stock market dynamics: (1) stock prices increase over time, (2) stock markets sometimes crash, (3) stock prices s...

Does greed help a forager survive?

We investigate the role of greed on the lifetime of a random-walking forager on an initially resource-rich lattice. Whenever the forager lands on a food-containing site, all the food there is eaten and the forager can hop S more steps without food before starving. Upon reaching an empty site, the forager comes one time unit closer to starvation. The forager is also greedy-given a choice to move...

Journal:
  • CoRR

Volume abs/1505.05629

Publication date: 2015